This tidy data set contains 1,599 red wines with 11 variables on the chemical properties of the wine. At least 3 wine experts rated the quality of each wine, providing a rating between 0 (very bad) and 10 (very excellent).
## 'data.frame': 1599 obs. of 13 variables:
## $ X : int 1 2 3 4 5 6 7 8 9 10 ...
## $ fixed.acidity : num 7.4 7.8 7.8 11.2 7.4 7.4 7.9 7.3 7.8 7.5 ...
## $ volatile.acidity : num 0.7 0.88 0.76 0.28 0.7 0.66 0.6 0.65 0.58 0.5 ...
## $ citric.acid : num 0 0 0.04 0.56 0 0 0.06 0 0.02 0.36 ...
## $ residual.sugar : num 1.9 2.6 2.3 1.9 1.9 1.8 1.6 1.2 2 6.1 ...
## $ chlorides : num 0.076 0.098 0.092 0.075 0.076 0.075 0.069 0.065 0.073 0.071 ...
## $ free.sulfur.dioxide : num 11 25 15 17 11 13 15 15 9 17 ...
## $ total.sulfur.dioxide: num 34 67 54 60 34 40 59 21 18 102 ...
## $ density : num 0.998 0.997 0.997 0.998 0.998 ...
## $ pH : num 3.51 3.2 3.26 3.16 3.51 3.51 3.3 3.39 3.36 3.35 ...
## $ sulphates : num 0.56 0.68 0.65 0.58 0.56 0.56 0.46 0.47 0.57 0.8 ...
## $ alcohol : num 9.4 9.8 9.8 9.8 9.4 9.4 9.4 10 9.5 10.5 ...
## $ quality : int 5 5 5 6 5 5 5 7 7 5 ...
## [1] "X" "fixed.acidity" "volatile.acidity"
## [4] "citric.acid" "residual.sugar" "chlorides"
## [7] "free.sulfur.dioxide" "total.sulfur.dioxide" "density"
## [10] "pH" "sulphates" "alcohol"
## [13] "quality"
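The structure and column listing above can be reproduced along the following lines. This is a minimal sketch, not the original chunk: the inline CSV holds only the first three rows and a subset of the columns shown in the `str()` output, and the names `csv_text` and `wine` are assumptions made for illustration. The real analysis would presumably call `read.csv()` on the source file instead.

```r
# First three rows and a subset of columns, copied from the str() output above.
# `csv_text` and `wine` are illustrative names, not taken from the report.
csv_text <- "X,fixed.acidity,volatile.acidity,citric.acid,alcohol,quality
1,7.4,0.70,0.00,9.4,5
2,7.8,0.88,0.00,9.8,5
3,7.8,0.76,0.04,9.8,5"

wine <- read.csv(text = csv_text)

str(wine)      # column types and leading values
names(wine)    # column names
summary(wine)  # min, quartiles, median, mean, and max for each column
```

Note that `X` is only a row index, so it is usually left out of any later analysis.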
## X fixed.acidity volatile.acidity citric.acid
## Min. : 1.0 Min. : 4.60 Min. :0.1200 Min. :0.000
## 1st Qu.: 400.5 1st Qu.: 7.10 1st Qu.:0.3900 1st Qu.:0.090
## Median : 800.0 Median : 7.90 Median :0.5200 Median :0.260
## Mean : 800.0 Mean : 8.32 Mean :0.5278 Mean :0.271
## 3rd Qu.:1199.5 3rd Qu.: 9.20 3rd Qu.:0.6400 3rd Qu.:0.420
## Max. :1599.0 Max. :15.90 Max. :1.5800 Max. :1.000
## residual.sugar chlorides free.sulfur.dioxide
## Min. : 0.900 Min. :0.01200 Min. : 1.00
## 1st Qu.: 1.900 1st Qu.:0.07000 1st Qu.: 7.00
## Median : 2.200 Median :0.07900 Median :14.00
## Mean : 2.539 Mean :0.08747 Mean :15.87
## 3rd Qu.: 2.600 3rd Qu.:0.09000 3rd Qu.:21.00
## Max. :15.500 Max. :0.61100 Max. :72.00
## total.sulfur.dioxide density pH sulphates
## Min. : 6.00 Min. :0.9901 Min. :2.740 Min. :0.3300
## 1st Qu.: 22.00 1st Qu.:0.9956 1st Qu.:3.210 1st Qu.:0.5500
## Median : 38.00 Median :0.9968 Median :3.310 Median :0.6200
## Mean : 46.47 Mean :0.9967 Mean :3.311 Mean :0.6581
## 3rd Qu.: 62.00 3rd Qu.:0.9978 3rd Qu.:3.400 3rd Qu.:0.7300
## Max. :289.00 Max. :1.0037 Max. :4.010 Max. :2.0000
## alcohol quality
## Min. : 8.40 Min. :3.000
## 1st Qu.: 9.50 1st Qu.:5.000
## Median :10.20 Median :6.000
## Mean :10.42 Mean :5.636
## 3rd Qu.:11.10 3rd Qu.:6.000
## Max. :14.90 Max. :8.000
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 3.000 5.000 6.000 5.636 6.000 8.000
We noticed that red wine quality is roughly normally distributed, with a mean of about 5.6.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 8.40 9.50 10.20 10.42 11.10 14.90
We noticed that alcohol is right-skewed, with a mean of about 10.4.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 6.00 22.00 38.00 46.47 62.00 289.00
We noticed that total.sulfur.dioxide is right-skewed, with a mean of about 46.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 4.60 7.10 7.90 8.32 9.20 15.90
We noticed that fixed.acidity is right-skewed, with a mean of about 8.3.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.1200 0.3900 0.5200 0.5278 0.6400 1.5800
We noticed that volatile.acidity is right-skewed, with a mean of about 0.53, and that there are a few outliers.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 0.090 0.260 0.271 0.420 1.000
We noticed that citric.acid has one outlier at 1.0 and a mean of about 0.27.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.900 1.900 2.200 2.539 2.600 15.500
We noticed that residual.sugar is right-skewed, with a mean of about 2.54.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.01200 0.07000 0.07900 0.08747 0.09000 0.61100
We noticed that chlorides has many outliers and a mean of about 0.087.
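The outlier remarks can be checked against a common rule of thumb: a value above Q3 + 1.5 × IQR is flagged as a high outlier. The sketch below applies that rule to the quartiles and maxima printed in the summaries above. The choice of the 1.5 × IQR rule and the names `upper_fence` and `checks` are assumptions, not something the report states.

```r
# Upper Tukey fence: Q3 + 1.5 * IQR (an assumed rule of thumb, not from the report).
upper_fence <- function(q1, q3) q3 + 1.5 * (q3 - q1)

# Quartiles and maxima copied from the summary() output in this report.
checks <- data.frame(
  variable = c("volatile.acidity", "citric.acid", "chlorides"),
  q1  = c(0.39, 0.09, 0.07),
  q3  = c(0.64, 0.42, 0.09),
  max = c(1.58, 1.00, 0.611)
)

checks$fence <- upper_fence(checks$q1, checks$q3)
checks$max_is_outlier <- checks$max > checks$fence  # TRUE if the maximum exceeds the fence
print(checks)
```

This only tests the maximum of each variable. Counting how many wines fall above the fence would need the full column.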
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.9901 0.9956 0.9968 0.9967 0.9978 1.0037
We noticed that density is roughly normally distributed, with a mean of about 0.997.
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.3300 0.5500 0.6200 0.6581 0.7300 2.0000
We noticed that sulphates is right-skewed, with a mean of about 0.66.
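A quick way to sanity-check the skewness remarks without a histogram is to compare each mean with its median: a mean clearly above the median suggests a longer right tail. The sketch below does this with the values printed in the summaries above. It is a rough heuristic under that assumption, and the names `stats` and `hint` are illustrative.

```r
# Medians and means copied from the summary() output in this report.
stats <- data.frame(
  variable = c("quality", "alcohol", "total.sulfur.dioxide", "fixed.acidity",
               "residual.sugar", "density", "sulphates"),
  median = c(6, 10.20, 38, 7.90, 2.200, 0.9968, 0.62),
  mean   = c(5.636, 10.42, 46.47, 8.32, 2.539, 0.9967, 0.6581)
)

# Heuristic only: mean above median hints at a right tail.
stats$hint <- ifelse(stats$mean > stats$median,
                     "right tail",
                     "left tail or symmetric")
print(stats)
```

For density the mean and median differ only in the fourth decimal place, which fits the description of a roughly symmetric distribution.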
There are 1,599 wines and 13 variables (“X”, “fixed.acidity”, “volatile.acidity”, “citric.acid”, “residual.sugar”, “chlorides”, “free.sulfur.dioxide”, “total.sulfur.dioxide”, “density”, “pH”, “sulphates”, “alcohol”, “quality”).

### What is/are the main feature(s) of interest in your dataset?

The main feature is “quality”, because it lets us judge which wines are better and compare that against the other factors.

### What other features in the dataset do you think will help support your investigation into your feature(s) of interest?

I think density and alcohol.

### Did you create any new variables from existing variables in the dataset?

No.

### Of the features you investigated, were there any unusual distributions? Did you perform any operations on the data to tidy, adjust, or change the form of the data? If so, why did you do this?

No, I did not; the data are already tidy.
Comparing total.sulfur.dioxide with free.sulfur.dioxide, we can see that most wines are concentrated at the low end of free.sulfur.dioxide, close to 0.
As we can see, wines with a higher alcohol percentage tend to have better quality.
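One way to look at this numerically is to average alcohol within each quality score. The sketch below is a minimal illustration that uses only the first ten rows shown in the `str()` output, which is far too few to confirm the trend in the full data. The vector names and `by_quality` are assumptions.

```r
# First ten values of each column, copied from the str() output in this report.
alcohol <- c(9.4, 9.8, 9.8, 9.8, 9.4, 9.4, 9.4, 10, 9.5, 10.5)
quality <- c(5, 5, 5, 6, 5, 5, 5, 7, 7, 5)

# Mean alcohol for each quality score present in the sample.
by_quality <- tapply(alcohol, quality, mean)
print(by_quality)

# Number of wines behind each mean, to show how thin the sample is.
print(table(quality))
```

On the full data set the same `tapply()` call would give one mean per quality score from 3 to 8, which is easier to read next to a boxplot of alcohol by quality.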
[Plot: quality and density]
As we can see, density does not have a significant impact on quality.
The four graphs above show the correlation between the pairs of variables we compared. I chose the chemical features that might have a significant correlation with wine quality.
In the graphs above we saw several correlations between variables, such as fixed.acidity and pH (with pH around 3.3).
Yes, we found a relationship between alcohol and quality: quality tends to be better when the alcohol content is higher.

### What was the strongest relationship you found?

Alcohol and quality, as mentioned earlier.
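Relationships like these can be quantified with a correlation matrix. The sketch below is a minimal illustration built from the first ten rows shown in the `str()` output, so its coefficients will not match those of the full 1,599 wines. The data frame name `wine10` and the choice of Pearson correlation (the default of `cor()`) are assumptions.

```r
# First ten rows of four columns, copied from the str() output in this report.
wine10 <- data.frame(
  fixed.acidity = c(7.4, 7.8, 7.8, 11.2, 7.4, 7.4, 7.9, 7.3, 7.8, 7.5),
  pH            = c(3.51, 3.2, 3.26, 3.16, 3.51, 3.51, 3.3, 3.39, 3.36, 3.35),
  alcohol       = c(9.4, 9.8, 9.8, 9.8, 9.4, 9.4, 9.4, 10, 9.5, 10.5),
  quality       = c(5, 5, 5, 6, 5, 5, 5, 7, 7, 5)
)

# Pairwise Pearson correlations; values near 0 indicate a weak linear relationship.
cor_matrix <- cor(wine10)
print(round(cor_matrix, 2))
```

Because quality is an integer score, `cor(wine10, method = "spearman")` is a reasonable rank-based alternative to compare against.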
As we can see, there is a relationship between quality, alcohol, and citric.acid: as alcohol increases, quality increases.
As mentioned for the graph above, quality increases as alcohol gets higher.
No
### Description One

The graph shows the quality of the wines; most wines have a quality of around 5 or 6.

### Description Two

As we can see, the correlation between alcohol and quality is positive.
High-quality wines tend to have higher alcohol and higher pH.
The red wines dataset contains 1,599 observations with 13 variables. After analyzing and exploring the data, I found, as I expected, that alcohol percentage is correlated with quality. However, I had also expected quality to be better at higher acidity. I think we need more features to be sure what affects the wines, such as fermentation time and process.

It is difficult to learn a new language, but I found it easier than Python for plotting, and helpful.